First, load the libraries that will be used.
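The library-loading chunk does not appear in this output. Judging only from the functions called below (`%>%` and `pivot_longer()` from the tidyverse, `make_clean_names()` from janitor), the setup was probably something like the following. The exact package list is an assumption.

```r
# Assumed setup: these packages provide the functions used in the rest of this document
library(tidyverse)  # attaches tidyr (pivot_longer) and dplyr (%>%)
library(janitor)    # make_clean_names
```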
The first dataset contains college faculty salaries and total compensation from 1995, broken down by rank, tier, and state. The ranks are Assistant, Associate, and Full. The tier reflects how much funding is devoted to research versus teaching; Tier I institutions spend more on research than on teaching and award PhD degrees.
Load and clean file:
df <- read.csv("./FacultySalaries_1995.csv")
names(df) <- make_clean_names(names(df))
names(df)
## [1] "fed_id" "univ_name" "state"
## [4] "tier" "avg_full_prof_salary" "avg_assoc_prof_salary"
## [7] "avg_assist_prof_salary" "avg_prof_salary_all" "avg_full_prof_comp"
## [10] "avg_assoc_prof_comp" "avg_assist_prof_comp" "avg_prof_comp_all"
## [13] "num_full_profs" "num_assoc_profs" "num_assist_profs"
## [16] "num_instructors" "num_faculty_all"
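df <- df %>%
  pivot_longer(c(avg_full_prof_salary, avg_assoc_prof_salary, avg_assist_prof_salary),
               # names_prefix only strips text from the START of a name, so it cannot
               # remove the "_salary" suffix; names_pattern keeps just the rank instead
               names_pattern = "avg_(.*)_prof_salary",
               values_to = "salary",
               names_to = "rank")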
df <- df[!df$tier %in% "VIIB", ]
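As a minimal sketch of the wide-to-long reshape used here, the example below runs `pivot_longer()` on a made-up two-row data frame. The university names and salary values are invented for illustration; only the column names follow the real file.

```r
library(tidyr)

# Made-up wide data: one salary column per rank (values are illustrative only)
wide <- data.frame(
  univ_name = c("Univ A", "Univ B"),
  avg_full_prof_salary   = c(600, 550),
  avg_assoc_prof_salary  = c(450, 420),
  avg_assist_prof_salary = c(380, 360)
)

long <- pivot_longer(
  wide,
  cols = c(avg_full_prof_salary, avg_assoc_prof_salary, avg_assist_prof_salary),
  names_to = "rank",
  names_pattern = "avg_(.*)_prof_salary",  # capture only the rank part of each name
  values_to = "salary"
)

print(long)
```

Each university now contributes one row per rank, which is the shape `aov()` needs in order to treat rank as a single factor.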
Creation of figure 1:
The ANOVA model tests the influence of “State”, “Tier”, and “Rank” on “Salary.”
aov.df <- aov(salary ~ tier + rank + state, data = df)
summary(aov.df)
##               Df   Sum Sq Mean Sq F value Pr(>F)
## tier           2  8119005 4059503 1116.49 <2e-16 ***
## rank           2 16521237 8260619 2271.92 <2e-16 ***
## state         50  4625868   92517   25.45 <2e-16 ***
## Residuals   3297 11987752    3636
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 128 observations deleted due to missingness
a <- summary(aov.df)
capture.output(a, file= "Salary_ANOVA_Summary.txt")
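For readers without the salary file, the same fit-summarize-save pattern can be sketched on simulated data. Everything in `toy` is made up (the effect sizes, the noise level, and the balanced design), so the numbers it produces say nothing about the real dataset.

```r
set.seed(1)  # reproducible made-up data

# Hypothetical balanced design: 3 tiers x 3 ranks, 20 observations per cell
toy <- expand.grid(
  tier = c("I", "IIA", "IIB"),
  rank = c("assist", "assoc", "full"),
  rep  = 1:20
)
# Salary is simulated with a rank effect and a tier effect plus noise
toy$salary <- 400 +
  50 * (as.integer(toy$rank) - 1) +
  30 * (toy$tier == "I") +
  rnorm(nrow(toy), sd = 20)

fit <- aov(salary ~ tier + rank, data = toy)  # main effects only, no interactions
tab <- summary(fit)[[1]]                      # the ANOVA table as a data frame
p_values <- tab[["Pr(>F)"]]                   # one p-value per term, NA for Residuals

# Write the printed summary to a temporary file with capture.output()
out_file <- tempfile(fileext = ".txt")
capture.output(summary(fit), file = out_file)
```

Pulling the table out with `summary(fit)[[1]]` makes the p-values available as ordinary numbers, which is handy if they need to be filtered or reported programmatically rather than read off the printout.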
The second dataset comes from a collaboration between Young Living Inc. and UVU Microbiology. A number of dead cedar trees were collected, and the chemical composition of their essential oils was measured. The hypothesis was that certain chemicals degrade with the time elapsed since the trees died in fires.
Clean data with chemical compounds:
df2 <- read.csv("./Juniper_Oils.csv")
names(df2)
## [1] "SampleID" "Project" "Amplicon"
## [4] "Tree_Species" "BurnYear" "Latitude"
## [7] "Longitude" "Field_Office" "BLM_Fire_Name"
## [10] "Tracking_Number" "alpha.pinene" "para.cymene"
## [13] "alpha.terpineol" "cedr.9.ene" "alpha.cedrene"
## [16] "beta.cedrene" "cis.thujopsene" "alpha.himachalene"
## [19] "beta.chamigrene" "cuparene" "compound.1"
## [22] "alpha.chamigrene" "widdrol" "cedrol"
## [25] "beta.acorenol" "alpha.acorenol" "gamma.eudesmol"
## [28] "beta.eudesmol" "alpha.eudesmol" "cedr.8.en.13.ol"
## [31] "cedr.8.en.15.ol" "compound.2" "thujopsenal"
## [34] "Yield_percent" "Bolt_Surface_Area_cm2" "Raw_Exit_Holes_per_cm2"
## [37] "Raw_Exit_Holes" "Living_Larvae" "ChemTotal"
## [40] "ChemMean" "YearsSinceBurn"
df2 <- df2 %>%
  pivot_longer(c("alpha.pinene","para.cymene","alpha.terpineol","cedr.9.ene","alpha.cedrene",
                 "beta.cedrene","cis.thujopsene","alpha.himachalene","beta.chamigrene",
                 "cuparene","compound.1","alpha.chamigrene","widdrol","cedrol","beta.acorenol",
                 "alpha.acorenol","gamma.eudesmol","beta.eudesmol","alpha.eudesmol",
                 "cedr.8.en.13.ol","cedr.8.en.15.ol","compound.2","thujopsenal"),
               values_to = "Concentration",
               names_to = "ChemicalID")
Chemicals whose concentrations are significantly (P < 0.05) affected by “Years Since Burn”:
mod1 <- glm(data = df2, Concentration ~ YearsSinceBurn * ChemicalID)
# Still in progress: narrowing the model terms down to those with P < 0.05
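One possible way to finish this step is to pull the coefficient table out of the fitted model and keep only the rows with P < 0.05. The sketch below uses made-up data with two hypothetical compounds (`chem_a`, `chem_b`); it is not the author's solution and has not been run on Juniper_Oils.csv.

```r
set.seed(42)  # reproducible made-up data

# Hypothetical long-format data: two compounds measured across burn ages
toy <- expand.grid(
  YearsSinceBurn = 1:15,
  ChemicalID     = c("chem_a", "chem_b"),
  rep            = 1:4
)
# chem_a declines with time; chem_b does not (purely illustrative)
slope <- ifelse(toy$ChemicalID == "chem_a", -2, 0)
toy$Concentration <- 50 + slope * toy$YearsSinceBurn + rnorm(nrow(toy), sd = 1)

mod <- glm(Concentration ~ YearsSinceBurn * ChemicalID, data = toy)

# Coefficient table as a data frame, keeping only terms with P < 0.05
coefs <- as.data.frame(coef(summary(mod)))
coefs$term <- rownames(coefs)
sig <- coefs[coefs[["Pr(>|t|)"]] < 0.05, ]
print(sig)
```

With the default treatment contrasts, each `YearsSinceBurn:ChemicalID...` row tests whether that compound's slope differs from the reference compound's slope, so the reference level matters when reading the result. `broom::tidy(mod)` returns a similar table with a `p.value` column if a tidyverse-style workflow is preferred.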
To push from the terminal:
git add "Exam 3"
git commit -m "Exam 3"
git push
My folder for Exam 3 is not on GitHub.